Summarising text with a genetic algorithm-based sentence extraction

Authors

  • Vahed Qazvinian
  • Leila Sharif Hassanabadi
  • Ramin Halavati
Abstract

Automatic text summarisation has long been studied and used. The growth in the amount of information on the web has increased the demand for automatic text summarisation methods. Designing a system that produces human-quality summaries is difficult; many researchers have therefore focused on sentence or paragraph extraction, which is a form of summarisation. In this paper, we introduce a new method for producing such extracts. Genetic Algorithm (GA)-based sentence selection is used to build a summary, and once the summary is created, it is evaluated using a fitness function. The fitness function is based on the following three factors: Readability Factor (RF), Cohesion Factor (CF) and Topic-Relation Factor (TRF). We introduce these factors and discuss the Genetic Algorithm with this specific fitness function. Evaluation results are also presented and discussed.
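
As a rough illustration of the idea in the abstract, the sketch below evolves a fixed-size set of sentence indices and scores each candidate summary with a three-part fitness function. The abstract does not define RF, CF or TRF, so the three measures here are hypothetical stand-ins built on word overlap, and the equal weighting, the crossover and mutation operators, and all function names (`jaccard`, `fitness`, `ga_summarise`) are assumptions made for this example rather than the authors' method.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Word-overlap similarity between two strings (an assumed stand-in measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def fitness(summary, sentences, title):
    """Equal-weight mix of three placeholder factors; NOT the paper's definitions."""
    chosen = [sentences[i] for i in summary]
    pairs = list(combinations(chosen, 2))
    # RF stand-in: similarity between adjacent summary sentences.
    rf = sum(jaccard(a, b) for a, b in zip(chosen, chosen[1:])) / max(len(chosen) - 1, 1)
    # CF stand-in: similarity over all pairs of summary sentences.
    cf = sum(jaccard(a, b) for a, b in pairs) / max(len(pairs), 1)
    # TRF stand-in: similarity of each summary sentence to the title.
    trf = sum(jaccard(s, title) for s in chosen) / len(chosen)
    return (rf + cf + trf) / 3


def crossover(p1, p2, k, rng):
    """Child draws k distinct indices from the union of both parents."""
    pool = sorted(set(p1) | set(p2))
    return tuple(sorted(rng.sample(pool, k)))


def mutate(summary, n, rng, rate=0.2):
    """With probability `rate`, swap one selected sentence for an unselected one."""
    if rng.random() < rate and len(summary) < n:
        out = rng.choice(summary)
        unused = [i for i in range(n) if i not in summary]
        summary = tuple(sorted((set(summary) - {out}) | {rng.choice(unused)}))
    return summary


def ga_summarise(sentences, title, k, pop_size=20, generations=30, seed=0):
    """Return k sentences (in document order) chosen by a simple GA."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    n = len(sentences)

    def score(s):
        return fitness(s, sentences, title)

    # Each individual is a sorted tuple of k sentence indices.
    population = [tuple(sorted(rng.sample(range(n), k))) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection; parents survive
        children = [
            mutate(crossover(*rng.sample(parents, 2), k, rng), n, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    best = max(population, key=score)
    return [sentences[i] for i in best]
```

Calling `ga_summarise(sentences, title, k=2)` on a list of sentence strings returns two of them in their original order. Keeping each individual as a sorted index tuple means the extract always preserves document order, which is one simple way to keep an extract readable; the paper's own encoding may differ.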

Similar resources

Biogeography-Based Optimization Algorithm for Automatic Extractive Text Summarization

Given the increasing number of documents, sites, online sources, and the users’ desire to quickly access information, automatic textual summarization has caught the attention of many researchers in this field. Researchers have presented different methods for text summarization as well as a useful summary of those texts including relevant document sentences. This study select...

Multi-document Summarization System: Using Fuzzy Logic and Genetic Algorithm

In recent times, the generation of multi-document summaries has gained a lot of attention among researchers. Most text summarization techniques rely on sentence extraction, where the salient sentences in the multiple documents are extracted and presented as a summary. In our proposed system, we have developed a sentence extraction based automatic multi-docu...

Improvement of Chemical Named Entity Recognition through Sentence-based Random Under-sampling and Classifier Combination

Chemical Named Entity Recognition (NER) is the basic step for subsequent information extraction tasks such as named entity resolution, drug-drug interaction discovery, and extraction of the names of molecules and their properties. Improvement in the performance of such systems may affect the quality of the subsequent tasks. Chemical text from which data for named entity recognition is extracte...

Msc in Speech and Language Processing Dissertation : Automatic Summarising Based on Sentence Extraction: a Statistical Approach

The present dissertation and project describe a system for automatic summarising of texts. Instead of generating abstracts, a hard NLP task of questionable effectiveness, the system tries to identify the most important sentences of the original text, thus producing an extract. The proposed, corpus-based and statistical approach exploits several heuristics to determine the summary-worthiness of ...

Towards an ANN-based Approach to Automatic Sentence Extraction of the Chinese Text

We propose an ANN-based automatic sentence extraction approach in this paper. We discuss in detail how to select the features of a sentence, and we also present the algorithms to compute the feature values. The experimental results show that this approach is feasible for implementing an automatic Chinese text abstracting system.

Publication date: 2008